Association between phenotypic and genotypic characteristics and disease severity in individuals with cystic fibrosis

ABSTRACT
Objective: To analyze the association between phenotypic and genotypic characteristics and disease severity in individuals with cystic fibrosis treated at a reference center in Minas Gerais, Brazil.
Methods: This is a retrospective study that collected clinical and laboratory data, respiratory and gastrointestinal manifestations, type of treatment, Shwachman-Kulczycki score, and mutations from the patients’ medical records.
Results: The sample included 50 participants aged one to 33 years, 50% of whom were female. Out of the one hundred alleles of the Cystic Fibrosis Transmembrane Conductance Regulator gene, the most prevalent mutations were DeltaF508 (45%) and S4X (18%). Mutation groups were only associated with pancreatic insufficiency (p=0.013) and not with disease severity (p=0.073). The latter presented an association with colonization by Pseudomonas aeruginosa and Staphylococcus aureus (p=0.007) and with underweight (p=0.036). Death was associated with age at diagnosis (p=0.016), respiratory symptomatology (p=0.013), colonization (p=0.024), underweight (p=0.017), and hospitalization (p=0.003).
Conclusions: We could identify the association of mutations with pancreatic insufficiency; the association of Staphylococcus aureus colonization and underweight with disease severity; and the lack of association between mutations and disease severity. Environmental factors should be investigated more thoroughly since they seem to have an important effect on disease severity.


INTRODUCTION
Cystic fibrosis (CF) is an autosomal recessive disorder caused by mutations in the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene, located on the long arm of chromosome 7, which encodes the CFTR protein, important for maintaining electrolyte balance. 1 The worldwide prevalence of CF is about 1:2,500 to 1:3,000, while in Brazil, it corresponds to 1:10,000 live births. 2,3 More than 2,000 CFTR gene mutations have been reported. 4 They affect the synthesis and function of the CFTR protein, characterizing the disease as multisystemic. 1,5 Mutations in the CFTR gene are divided into six functional classes. Usually, classes I, II, III, and VI cause loss of CFTR function or expression and are associated with more severe phenotypes. Classes IV and V reduce CFTR function or expression, causing mild to moderate disease phenotypes. 5 The molecular analysis of the CFTR gene and the attempt to classify mutations into functional classes can optimize the treatment of affected individuals by helping to identify those eligible for mutation-specific therapies, which act as CFTR correctors or potentiators. 6 However, the correlation between genotype and phenotype has not been fully explained, given the heterogeneity of mutations, clinical manifestations, and other genetic and environmental factors involved in the modulation of this disease. 7 Furthermore, the functional classification of mutations seems to have some limitations and should not be used alone to determine the relationship between genotype and phenotype. 7 Considering this context, the present study aimed to analyze the association between phenotypic and genotypic characteristics and disease severity in individuals with CF treated at a reference center in Minas Gerais, Brazil.

METHOD
This is a retrospective cross-sectional study. The research protocol was approved by the Human Research Ethics Committee of the institution, according to Resolution No. 466/12 of the Brazilian National Health Council (Opinion No. 3,669,592).
The study included individuals diagnosed with CF (confirmed by two sweat tests), followed by the multidisciplinary team of a reference center, who had a report of the molecular analysis of the CFTR gene. Out of 57 individuals, we excluded two because they did not have the report mentioned above; four who had no mutation justifying the clinical manifestation of CF; and one due to lack of information in their medical record.
Data were collected from the participants' medical records between June and August 2020, and the findings were transcribed to a form elaborated for the study. Anthropometric and clinical characteristics were collected, as well as information on mutations in the CFTR gene. The variables age, weight, height, respiratory symptomatology, medications used, nutritional status, respiratory physiotherapy, unsupervised airway clearance therapy, and unsupervised physical activity were collected and analyzed considering the records of the last visit. For the variables age at diagnosis, history of hemoptysis, sinusitis, nasal polyps, meconium ileus, pancreatic insufficiency, hepatic fibrosis, hepatic steatosis, gallstones, gastritis, splenomegaly, hepatomegaly, kidney stones, diabetes, and mutation in the CFTR gene, we considered the first entries of each participant's medical records (from birth to last entry). The colonization classification, the pathogen involved in the colonization, and hospitalization were collected based on entries from the previous year.
We analyzed the deaths that occurred in the two years prior to data collection, given the rarity of this event.
Disease severity was assessed using the Shwachman-Kulczycki score, which evaluates four categories of clinical characteristics: overall activity, physical examination, nutrition, and radiological findings. Each category receives a score of 5, 10, 15, 20, or 25. The final score is the sum of the four categories and classifies the clinical manifestation as severe (≤40), moderate (41-55), mild (56-70), good (71-85), or excellent (86-100). 8 For each individual, the Shwachman-Kulczycki score was obtained by calculating the mean score of the previous year. This score is not indicated to analyze disease severity in adults, especially with respect to nutritional status. For this reason, we also considered the parameters weight-for-age, weight-for-height, height-for-age, and body mass index (BMI) in this assessment. The nutritional status of individuals under 18 years of age was classified according to the World Health Organization (WHO) growth curves. 9
The laboratory data collected from medical records were the sputum cultures from the previous year (to identify the pathogens involved in the colonization) and the molecular analysis of the CFTR gene. Colonization was classified as intermittent when less than 50% of cultures were positive for the same pathogen in the previous year and as chronic when more than 50% were positive.
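The two classification rules described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used at the reference center: the function names are hypothetical, a score of exactly 40 is treated as severe, fractional yearly means that fall between two published bands (for example, 55.5) are assigned to the higher band, zero positive cultures is labeled "not colonized", and a share of exactly 50% is returned as "unclassified" because the stated rule does not cover it. All of these choices are assumptions of the sketch.

```python
def classify_sk_score(score: float) -> str:
    """Map a Shwachman-Kulczycki total (0-100) to a severity band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 40:   # assumption: exactly 40 counts as severe
        return "severe"
    if score <= 55:   # assumption: means between bands go to the higher band
        return "moderate"
    if score <= 70:
        return "mild"
    if score <= 85:
        return "good"
    return "excellent"


def mean_sk_score(visit_scores: list[float]) -> float:
    """Mean of the scores recorded during the previous year."""
    if not visit_scores:
        raise ValueError("at least one score is required")
    return sum(visit_scores) / len(visit_scores)


def classify_colonization(positive: int, total: int) -> str:
    """Classify colonization from the previous year's sputum cultures."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need total > 0 and 0 <= positive <= total")
    if positive == 0:
        return "not colonized"  # assumption: label not given in the text
    share = positive / total
    if share > 0.5:
        return "chronic"
    if share < 0.5:
        return "intermittent"
    return "unclassified"  # exactly 50% is not covered by the stated rule
```

For example, `classify_sk_score(mean_sk_score([75, 80]))` gives "good", and one positive culture out of four gives "intermittent".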
Every individual has two alleles of this gene. Thus, the participants were categorized into four groups according to the most prevalent mutation, DeltaF508: DeltaF508/DeltaF508, DeltaF508/other mutation, other mutation/other mutation, and other mutation/absence of mutation. We chose this grouping, already used in previous studies, 10,11 due to the high variability of these mutations in a small sample, in addition to limitations regarding the functional classification of CFTR gene mutations. 7 Participants with only one mutation identified were not excluded from the study if the diagnosis was confirmed by the two sweat tests, with chloride values equal to or greater than 60mEq/L, and by the presence of at least one phenotypic characteristic of CF. In addition, these participants could have a second mutated allele that was not identified by the adopted molecular analysis technique, which used only exon capture followed by next-generation sequencing and did not cover the introns of the CFTR gene, where other mutations could be located.
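The four-group categorization above can be expressed as a small function. This is a minimal sketch with hypothetical names; `None` stands for an allele in which no mutation was identified. The text does not define a group for DeltaF508 paired with no identified mutation, so the sketch raises an error in that case, which is an assumption rather than a rule taken from the study.

```python
from typing import Optional

DELTA_F508 = "DeltaF508"


def genotype_group(allele_1: Optional[str], allele_2: Optional[str]) -> str:
    """Assign a participant to one of the four mutation groups."""
    identified = [a for a in (allele_1, allele_2) if a is not None]
    if not identified:
        raise ValueError("at least one mutation must be identified")

    f508_count = sum(a == DELTA_F508 for a in identified)

    if len(identified) == 2:
        if f508_count == 2:
            return "DeltaF508/DeltaF508"
        if f508_count == 1:
            return "DeltaF508/other mutation"
        return "other mutation/other mutation"

    # Only one mutation identified.
    if f508_count == 1:
        # Assumption: this combination is not among the four groups in the text.
        raise ValueError("DeltaF508/absence of mutation is not a defined group")
    return "other mutation/absence of mutation"
```

The order of the two alleles does not matter: `genotype_group("S4X", "DeltaF508")` and `genotype_group("DeltaF508", "S4X")` both return "DeltaF508/other mutation".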
We performed a descriptive analysis based on absolute and relative frequencies. Continuous variables (current age and age at diagnosis) were expressed as means and standard deviations or medians, categorical variables as absolute and relative frequencies, and the discrete variable (Shwachman-Kulczycki score) as means, medians, and standard deviations.
We used the Minitab software and the Poisson regression model to analyze the association between disease severity and the following phenotypic variables: respiratory symptomatology, current age, colonization classification, and nutritional status. Spearman's correlation coefficient (Rho) was obtained to assess the relationship between age (numerical variable) and disease severity. 12 The association of mutations (genotypic variables) with pancreatic insufficiency, respiratory symptomatology, and disease severity was analyzed using the Kruskal-Wallis test. This test was also employed to investigate the association of death with age, age at diagnosis, and other variables (gender, pancreatic insufficiency, colonization, respiratory symptomatology, nutritional status, supervised physiotherapy, unsupervised physiotherapeutic interventions such as airway clearance therapy and physical activity, and hospitalization). 12 The R software was used for other analyses, considering p<0.05 as statistically significant.
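To make the two rank-based statistics named above concrete, the following pure-Python sketch computes Spearman's Rho and the Kruskal-Wallis H statistic with the usual tie correction. It is not the authors' analysis, which was run in Minitab and R; the function names are hypothetical, and p-values are omitted because they require a chi-square or t distribution that the standard library does not provide.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's Rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    position = 0
    for group in groups:
        rank_sum = sum(ranks[position:position + len(group)])
        weighted += rank_sum ** 2 / len(group)
        position += len(group)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction
```

In practice, a call such as `kruskal_wallis_h([scores_group_a, scores_group_b, ...])` would take one list of Shwachman-Kulczycki scores per mutation group; those variable names are placeholders, not data from the study.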

RESULTS
The study had a sample of 50 individuals aged one to 33 years. Among them, 25 (50%) were females, and 37 (74%) were Caucasians.
Only 28 (56%) of the participants presented positive neonatal screening, five (10%) were not screened, 15 (30%) presented negative screening, and two (4%) did not have this information in their medical records. Table 1 presents the participants' clinical and laboratory characteristics. The most prevalent age at diagnosis was the first month of life (44%), and most participants (n=23; 46%) presented an excellent Shwachman-Kulczycki score.
DeltaF508 mutations in heterozygous individuals showed a significant association (p=0.013) with pancreatic insufficiency but not with disease severity (p=0.073) or respiratory symptomatology (p=0.666). The group with the combination other mutation/absence of mutation obtained the best severity scores (median Shwachman-Kulczycki score of 95.0). The DeltaF508/other mutation combination was associated with the worst Shwachman-Kulczycki scores (median=77.50), and all individuals with this combination had pancreatic insufficiency.
Disease severity showed a significant association with colonization by Pseudomonas aeruginosa and Staphylococcus aureus (p=0.007) and the underweight nutritional status (p=0.036), that is, these characteristics contribute to disease severity, reducing the mean Shwachman-Kulczycki score (Table 3).
Individuals with typical respiratory symptomatology presented a higher mean score than those with exacerbated symptomatology, that is, typical symptomatology was associated with lower disease severity.
Spearman's correlation coefficient (Rho) indicated that younger individuals tended to have higher Shwachman-Kulczycki scores.
Death was significantly associated with age at diagnosis (p=0.016): the younger the age at diagnosis, the lower the occurrence of death. In addition, the significant association of death with colonization (p=0.024) revealed that individuals with colonization by Pseudomonas aeruginosa and Staphylococcus aureus had a higher risk of death. Exacerbated respiratory symptomatology (p=0.013), underweight (p=0.017), and hospitalization in the previous year (p=0.003) also showed a significant association with death (Table 4).

DISCUSSION
The present study found an association between CFTR gene mutations and pancreatic insufficiency, in addition to an association of disease severity with colonization by Staphylococcus aureus and Pseudomonas aeruginosa and with underweight. Scientific evidence has shown 5,13,14 that CFTR gene mutations are associated with pancreatic insufficiency but have a poor association with respiratory symptomatology. These findings may be related to the important influence of intracellular and physiological lung characteristics and environmental factors on pulmonary function, given the wide range of respiratory symptoms in individuals with CF. 5,13-16 Environmental factors associated with pulmonary function in CF cases include the weather, air pollution, tobacco use, treatment adherence, access to care centers, and socioeconomic aspects. 15 Although the present study did not analyze the association of the functional classification of mutations, other investigations 5,7 showed that, usually, an allele with a severe mutation belonging to functional classes I, II, III, or VI only leads to pancreatic insufficiency if paired with another allele with a severe mutation. Therefore, one allele with a mild mutation (class IV or V) preserves the pancreatic function even if combined with an allele with a severe mutation, since it presents residual Cl⁻ channel activity in epithelial cell membranes.
CFTR gene mutations showed no significant association with disease severity. This result was similar to that of another study 10 carried out in Brazil, which also analyzed disease severity in individuals with a mean age of 12.38±9.0 years, using the Shwachman-Kulczycki score. The research suggested a greater influence of other factors, such as gene modulation and poor treatment adherence, on the higher early mortality. 10 Most study participants were homozygous (n=15; 30%) or heterozygous (n=15; 30%) for the DeltaF508 mutation, totaling 45 (45%) of one hundred alleles; this is the most prevalent and studied CFTR gene mutation in the world. 2-4,7,10 It belongs to class II, meaning lack of functional CFTR protein and association with more severe CF phenotypes, with early respiratory symptoms, reduced pulmonary function, and pancreatic insufficiency. 5 This result reinforces the influence of the environment on these patients' prognosis, as the individuals investigated in this study showed no association between phenotypic and genotypic characteristics.
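The allele count reported above follows directly from the genotype counts: each homozygous participant carries two DeltaF508 alleles and each heterozygous participant carries one. A short check, using only figures stated in the text (the function name is hypothetical):

```python
def mutation_allele_count(homozygous: int, heterozygous: int) -> int:
    """Number of alleles carrying a mutation, given genotype counts."""
    return 2 * homozygous + heterozygous


participants = 50
total_alleles = 2 * participants                # two CFTR alleles per person
f508_alleles = mutation_allele_count(15, 15)    # 15 homozygous, 15 heterozygous
f508_share = f508_alleles / total_alleles

print(f"{f508_alleles} of {total_alleles} alleles ({f508_share:.0%})")
```

This reproduces the 45 of one hundred alleles (45%) given for DeltaF508.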
No association was found between respiratory symptomatology and disease severity, contrary to previous studies. 17-19 This finding is probably due to the fact that the present study classified the respiratory manifestation based on clinical analysis, while other investigations used pulmonary function tests. In addition, most individuals were classified as asymptomatic (n=9; 18%) or presenting typical symptoms (n=23; 46%) regarding respiratory symptomatology, and, according to Stollar et al., the analysis of disease severity through the Shwachman-Kulczycki score may be limited when assessing patients with mild lung disease. 19 The most prevalent pathogens were Staphylococcus aureus (n=20; 40%), Pseudomonas aeruginosa (n=7; 14%), and the combined colonization by Staphylococcus aureus and Pseudomonas aeruginosa (n=4; 8%). This finding was similar to that of recent studies, which identified a colonization change in European and American populations, with Pseudomonas aeruginosa as the most prevalent. 20,21 However, the colonization that showed a significant association with disease severity was the combination of Staphylococcus aureus and Pseudomonas aeruginosa, in contrast to a similar study, which detected an association between Pseudomonas aeruginosa colonization and CF severity. 17 This difference may be related to the environmental differences between the sites where the studies were conducted. Staphylococcus aureus is the first pathogen to colonize individuals with CF, predisposing them, in adulthood, to Pseudomonas aeruginosa colonization, still considered the main cause of progressive lung disease and, consequently, of the greater morbidity and mortality in most of these individuals. 16,22 Disease severity also showed a significant association with nutritional status (underweight).
According to previous studies, nutritional deficiency is associated with the severity of CFTR gene mutations, worse pulmonary function, exercise intolerance, and lower survival, as well as higher rates of chronic infections. 23-26 This relationship can be justified by the fact that, in individuals with CF, underweight is caused by malabsorption (mainly due to pancreatic insufficiency) and by higher energy expenditure related to excessive respiratory effort resulting from inflammatory conditions and pulmonary infections. 23,24 Consequently, the reduced muscle mass lowers strength, respiratory muscle resistance, and exercise tolerance. 23,24,26 Also, death showed a statistically significant association with age at diagnosis, colonization, respiratory symptomatology, nutritional status, and hospitalization. Late diagnosis is associated with a worse prognosis regarding pulmonary function, pancreatic insufficiency, nutritional status, and survival, 11,27,28 in addition to higher rates of hospitalization and colonization. 28 The implementation of neonatal screening is crucial for early diagnosis and the referral of these individuals to specialized centers without delay. Early follow-up improves the prognosis of the disease and reduces comorbidities. 28 The association of death with colonization, respiratory symptomatology, nutritional status, and hospitalization can be explained by their direct or indirect relationship with respiratory complications, which are the main cause of mortality in CF. 22 Most participants did not undergo supervised physiotherapy or unsupervised physiotherapeutic interventions (airway clearance therapy and physical activity), which are essential for a better disease prognosis. 29 Nevertheless, these aspects showed no significant association with disease severity or death, a result that may be related to other variables not analyzed in the present study, such as the frequency and type of exercises and physical activity practiced.
The reference center where the study was conducted provides guidance for the practice of respiratory exercises and the importance of physical activity but does not have a specific physiotherapy program for patient rehabilitation. Usually, these patients are from other locations and receive physiotherapy in their city of origin. In addition to respiratory exercises, supervised physical activity has beneficial effects on pulmonary function, as well as anthropometric and biochemical parameters in individuals with CF. 29 This study presents some limitations, such as the assessment of disease severity through the Shwachman-Kulczycki score, which is subjective. However, the participants are followed by the same professionals since the first visits, reducing the subjectivity of the score, which is the most used instrument to monitor CF severity. 19 The score also presents a limitation related to the assessment of nutritional status in adults. In order to minimize this bias, we also considered the parameters weight-for-age, weight-for-height, height-for-age, and BMI.
The assessment of respiratory symptomatology presented limitations as well, as data were collected during a pandemic period, preventing us from obtaining recent respiratory function information. Additionally, we emphasize that, since we used a convenience sample, results such as the association between genotype and disease severity may be different in future studies performed with sample calculation.
In conclusion, CFTR gene mutations were significantly associated with pancreatic insufficiency but not with disease severity. Colonization and nutritional status showed a significant association with disease severity. Moreover, death was significantly associated with age at diagnosis, colonization, respiratory symptomatology, nutritional status, and hospitalization. These findings emphasize the importance of early diagnosis associated with the genetic analysis of individuals with CF to provide specific treatments. In addition, nutritional status needs to be carefully evaluated to favor the prognosis and survival of these individuals.
Environmental factors should be investigated more thoroughly since they seem to have an important impact on disease severity, especially on respiratory function. We suggest the performance of prospective studies in the future.

Funding
This study was partly funded by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES). Grant code 001.